Approaches That Use Domain-Specific Expertise: Behavioral-Cloning-Based Advantage Actor-Critic in Basketball Games
Authors
Abstract
Research on the application of artificial intelligence (AI) in games has recently gained momentum. Most commercial games still use AI based on a finite state machine (FSM) due to complexity and cost considerations. However, FSM-based AI decreases user satisfaction, given that it performs the same patterns of consecutive actions in similar situations. This necessitates a new approach that applies domain-specific expertise to existing reinforcement learning algorithms. We propose a behavioral-cloning-based advantage actor-critic (A2C) method that improves performance by applying a behavioral cloning algorithm to an A2C for basketball games. Normalization, reward-function, and episode-classification approaches are used with the A2C. The results of comparative experiments against traditional algorithms validated the proposed method. Our method, using domain-specific expertise, solved the difficulty...
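To make the combination concrete, the following is a minimal PyTorch sketch of one way a behavioral-cloning (BC) loss can be added to a standard A2C objective for discrete actions. The network shape and the names ActorCritic, bc_coef, value_coef, and entropy_coef are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ActorCritic(nn.Module):
    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.policy = nn.Linear(hidden, n_actions)  # actor head (logits)
        self.value = nn.Linear(hidden, 1)           # critic head

    def forward(self, obs):
        h = self.body(obs)
        return self.policy(h), self.value(h).squeeze(-1)

def bc_a2c_loss(model, obs, actions, returns, expert_obs, expert_actions,
                value_coef=0.5, entropy_coef=0.01, bc_coef=1.0):
    # Standard A2C terms: policy gradient weighted by the advantage,
    # value regression, and an entropy bonus for exploration.
    logits, values = model(obs)
    dist = torch.distributions.Categorical(logits=logits)
    advantages = returns - values.detach()
    policy_loss = -(dist.log_prob(actions) * advantages).mean()
    value_loss = F.mse_loss(values, returns)
    entropy = dist.entropy().mean()
    # BC term: cross-entropy against expert demonstrations, which is one
    # way to inject the domain-specific expertise the abstract refers to.
    expert_logits, _ = model(expert_obs)
    bc_loss = F.cross_entropy(expert_logits, expert_actions)
    return (policy_loss + value_coef * value_loss
            - entropy_coef * entropy + bc_coef * bc_loss)

In practice bc_coef would be tuned or annealed so that imitation dominates early training and the A2C terms take over later; that schedule is an assumption here, not a reported result.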
Similar Resources
Actor-Critic Reinforcement Learning with Neural Networks in Continuous Games
Reinforcement learning agents with artificial neural networks have previously been shown to acquire human level dexterity in discrete video game environments where only the current state of the game and a reward are given at each time step. A harder problem than discrete environments is posed by continuous environments where the states, observations, and actions are continuous, which is what th...
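As a small illustration of what the continuous setting changes, the sketch below (a hedged assumption, not the thesis's network) swaps the usual categorical policy head for a Gaussian one, so the actor can emit real-valued actions while still providing the log-probabilities a policy-gradient update needs.

import torch
import torch.nn as nn

class GaussianActor(nn.Module):
    # Gaussian policy for continuous actions: the network outputs the
    # mean, and a learned log-std parameter sets the exploration noise.
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.mu = nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(),
                                nn.Linear(hidden, act_dim))
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs):
        dist = torch.distributions.Normal(self.mu(obs), self.log_std.exp())
        action = dist.sample()                         # real-valued action
        return action, dist.log_prob(action).sum(-1)   # for the actor update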
Hierarchical Actor-Critic
The ability to learn at different resolutions in time may help overcome one of the main challenges in deep reinforcement learning — sample efficiency. Hierarchical agents that operate at different levels of temporal abstraction can learn tasks more quickly because they can divide the work of learning behaviors among multiple policies and can also explore the environment at a higher level. In th...
Projected Natural Actor-Critic
Natural actor-critics form a popular class of policy search algorithms for finding locally optimal policies for Markov decision processes. In this paper we address a drawback of natural actor-critics that limits their real-world applicability—their lack of safety guarantees. We present a principled algorithm for performing natural gradient descent over a constrained domain. In the context of re...
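As a hedged sketch of the generic idea (the box constraint and projection here are simplifying assumptions, not the paper's constraint sets): precondition the gradient by the inverse Fisher information matrix, take a step, and then project the parameters back onto the feasible region so the constraint is never violated.

import numpy as np

def projected_natural_gradient_step(theta, grad, fisher, lr, lo, hi):
    # Natural gradient: solve F x = g rather than inverting F explicitly.
    natural_grad = np.linalg.solve(fisher, grad)
    theta = theta + lr * natural_grad      # ascend the objective
    return np.clip(theta, lo, hi)          # project onto the box [lo, hi]^d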
Off-Policy Actor-Critic
This paper presents the first actor-critic algorithm for off-policy reinforcement learning. Our algorithm is online and incremental, and its per-time-step complexity scales linearly with the number of learned weights. Previous work on actor-critic algorithms is limited to the on-policy setting and does not take advantage of the recent advances in off-policy gradient temporal-difference learning....
Mean Actor Critic
We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning. MAC is a policy gradient algorithm that uses the agent’s explicit representation of all action values to estimate the gradient of the policy, rather than using only the actions that were actually executed. This significantly reduces variance in the gradient updates and removes the n...
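The estimator the abstract describes replaces the sampled-action score-function gradient with an expectation over all actions: the policy gradient becomes the gradient of the policy-weighted sum of action values. A minimal PyTorch sketch, assuming a critic that outputs one Q-value per action (the names are illustrative, not the authors' released code):

import torch

def mac_policy_loss(logits, q_values):
    # logits, q_values: shape (batch, n_actions). Differentiating this
    # loss yields -sum_a Q(s,a) * d pi(a|s)/d theta, the MAC estimator,
    # which averages over all actions rather than only executed ones.
    probs = torch.softmax(logits, dim=-1)
    return -(probs * q_values.detach()).sum(dim=-1).mean()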
Journal
Journal title: Mathematics
Year: 2023
ISSN: 2227-7390
DOI: https://doi.org/10.3390/math11051110